{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cookbook: Database Manipulation\n",
    "\n",
    "In this cookbook, we will explore a simple data processing example: reading a file, adding its data to a database, and querying it. It shows how we can efficiently manipulate data with agents."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Install Requirements\n",
    "\n",
    "We will use [Qwen-Agent](https://github.com/QwenLM/Qwen-Agent) in this book. For demonstration, we will need at least the `mcp` functionality of Qwen-Agent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip3 install -U \"qwen-agent[gui,rag,code_interpreter,mcp]\"\n",
    "# `pip install -U qwen-agent` will install the minimal requirements.\n",
    "# The optional requirements, specified in double brackets, are:\n",
    "#   [gui] for Gradio-based GUI support;\n",
    "#   [rag] for RAG support;\n",
    "#   [code_interpreter] for Code Interpreter support;\n",
    "#   [mcp] for MCP support."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip3 install -U uv\n",
    "# We will use mcp servers that run with the `uvx` command, so `uv` is required."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!npm --version\n",
    "# We also use mcp servers that run with the `npx` command, so node.js should be installed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Create a Spreadsheet\n",
    "\n",
    "We create a sample spreadsheet and save it as `scores.csv` for later use."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import csv\n",
    "\n",
    "csv_file_name = 'scores.csv'\n",
    "table = [\n",
    "    {\"id\": 100, \"score\": 85},\n",
    "    {\"id\": 101, \"score\": 90},\n",
    "    {\"id\": 102, \"score\": 88},\n",
    "    {\"id\": 103, \"score\": 95},\n",
    "    {\"id\": 104, \"score\": 80},\n",
    "    {\"id\": 105, \"score\": 92},\n",
    "    {\"id\": 106, \"score\": 87},\n",
    "    {\"id\": 107, \"score\": 100},\n",
    "    {\"id\": 108, \"score\": 83},\n",
    "    {\"id\": 109, \"score\": 98}\n",
    "]\n",
    "\n",
    "with open(csv_file_name, mode='w', newline='') as csv_file:\n",
    "    writer = csv.DictWriter(csv_file, fieldnames=['id', 'score'])\n",
    "    writer.writeheader()\n",
    "    for record in table:\n",
    "        writer.writerow(record)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 1: Create an Agent\n",
    "\n",
    "Create an agent that is capable of \n",
    "- listing, reading and writing files (through the MCP server `@modelcontextprotocol/server-filesystem`).\n",
    "- performing SQLite database manipulations (through the MCP server `mcp-server-sqlite`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from qwen_agent.agents import Assistant\n",
    "from qwen_agent.utils.output_beautify import typewriter_print\n",
    "# `typewriter_print` prints streaming messages in a non-overlapping manner for a clear view."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2025-05-01 21:07:26,268 - mcp_manager.py - 110 - INFO - Initializing MCP tools from mcp servers: ['filesystem', 'sqlite']\n",
      "2025-05-01 21:07:26,275 - mcp_manager.py - 245 - INFO - Initializing a MCP stdio_client, if this takes forever, please check the config of this mcp server: filesystem\n",
      "2025-05-01 21:07:27,440 - mcp_manager.py - 245 - INFO - Initializing a MCP stdio_client, if this takes forever, please check the config of this mcp server: sqlite\n"
     ]
    }
   ],
   "source": [
    "llm_cfg = {\n",
    "    'model': 'qwen3-32b',\n",
    "    'model_server': 'dashscope',\n",
    "    'api_key': '' # **fill your dashscope api key here**\n",
    "\n",
    "    # Use a model service compatible with the OpenAI API, such as vLLM or Ollama:\n",
    "    # 'model': 'Qwen3-8B',\n",
    "    # 'model_server': 'http://localhost:8000/v1',  # base_url, also known as api_base\n",
    "    # 'api_key': 'EMPTY'\n",
    "}\n",
    "\n",
    "tools = [\n",
    "    {\n",
    "        \"mcpServers\": {\n",
    "            # enumeration of mcp server configs\n",
    "            \"filesystem\": {\n",
    "                \"command\": \"npx\",\n",
    "                \"args\": [\n",
    "                    \"-y\",\n",
    "                    \"@modelcontextprotocol/server-filesystem\",\n",
    "                    '.',\n",
    "                ]\n",
    "            },\n",
    "            \"sqlite\" : {\n",
    "                \"command\": \"uvx\",\n",
    "                \"args\": [\n",
    "                    \"mcp-server-sqlite\",\n",
    "                    \"--db-path\",\n",
    "                    \"scores.db\"\n",
    "                ]\n",
    "            }\n",
    "        }\n",
    "    }\n",
    "]\n",
    "\n",
    "agent = Assistant(\n",
    "    llm=llm_cfg, \n",
    "    function_list=tools\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 2: Read the Spreadsheet\n",
    "\n",
    "The agent will read the csv table by calling the file reading tool."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "messages = [\n",
    "    {\n",
    "        'role': 'user',\n",
    "        'content': 'Show me the contents in scores.csv.'\n",
    "    }\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[THINK]\n",
      "Okay, the user wants to see the contents of scores.csv. Let me check which tool to use. The available functions include reading a file, so I should use filesystem-read_file. The parameters require the path to the file. Assuming scores.csv is in an allowed directory, I'll call that function with the path \"scores.csv\". I need to make sure the path is correct and accessible. If there's an issue, the function will return an error, but I'll proceed with the call as instructed.\n",
      "[TOOL_CALL] filesystem-read_file\n",
      "{\"path\": \"scores.csv\"}\n",
      "[TOOL_RESPONSE] filesystem-read_file\n",
      "id,score\n",
      "100,85\n",
      "101,90\n",
      "102,88\n",
      "103,95\n",
      "104,80\n",
      "105,92\n",
      "106,87\n",
      "107,100\n",
      "108,83\n",
      "109,98\n",
      "\n",
      "[THINK]\n",
      "Okay, the user asked for the contents of scores.csv, and I used the filesystem-read_file tool to get it. The response shows the CSV data with id and score columns. Now I need to present this information clearly. Let me check if there's any specific format the user wants. Since they just said \"show me the contents,\" a simple display of the data should suffice. I'll format it as a table for readability, making sure to include the headers and all the rows provided. I should also mention that the data was retrieved successfully in case they need further actions.\n",
      "Here are the contents of `scores.csv`:\n",
      "\n",
      "| id  | score |\n",
      "|-----|-------|\n",
      "| 100 | 85    |\n",
      "| 101 | 90    |\n",
      "| 102 | 88    |\n",
      "| 103 | 95    |\n",
      "| 104 | 80    |\n",
      "| 105 | 92    |\n",
      "| 106 | 87    |\n",
      "| 107 | 100   |\n",
      "| 108 | 83    |\n",
      "| 109 | 98    |\n",
      "\n",
      "Let me know if you'd like to analyze this data further or perform any modifications!"
     ]
    }
   ],
   "source": [
    "response_plain_text = ''\n",
    "for ret_messages in agent.run(messages):\n",
    "    # `ret_messages` will contain all subsequent messages, consisting of interleaved assistant messages and tool responses\n",
    "    response_plain_text = typewriter_print(ret_messages, response_plain_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "messages += ret_messages # extending the context with new `ret_messages`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 3: Create a Database Table\n",
    "\n",
    "Next, we create a database table and insert all the data that the agent just read."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "messages.append({'role': 'user', 'content': \"Create a table in the database, and fill it with the contents in the previous table.\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[THINK]\n",
      "Okay, the user wants me to create a table in the database and fill it with the data from the scores.csv file I just showed. Let me think about how to approach this.\n",
      "\n",
      "First, I need to create a table. The data has two columns: id and score. Both are integers. So the SQL statement should define a table with those columns. Maybe name it 'scores' for simplicity. The SQL would be CREATE TABLE scores (id INTEGER, score INTEGER).\n",
      "\n",
      "Next, I need to insert the data. The user probably expects me to use the data from the CSV. Since I have the CSV contents, I can generate INSERT statements for each row. But the user might not want to do that manually. Wait, the tools available include sqlite-create_table and sqlite-write_query. So first, I should call sqlite-create_table with the CREATE TABLE statement. Then, for each row, execute an INSERT. But inserting one by one might be inefficient. Alternatively, can I do a multi-row insert? Like INSERT INTO scores (id, score) VALUES (100,85), (101,90), ... ?\n",
      "\n",
      "Yes, SQLite supports inserting multiple rows in a single statement. That would be more efficient. Let me check the data. The CSV has 10 rows. So the INSERT statement would have all those values. Let me construct that.\n",
      "\n",
      "So first, call sqlite-create_table with the CREATE TABLE query. Then, call sqlite-write_query with the INSERT statement containing all the rows. That should populate the table.\n",
      "\n",
      "Wait, but the user might not have the data in a file that's accessible to the database. Since the data is already in the CSV, which I read earlier, I need to generate the INSERT statements from that data. Since I have the data as a string, I can parse it. The CSV has headers id,score followed by the rows. So split the string by newlines, skip the first line, then process each line into id and score.\n",
      "\n",
      "Let me outline the steps:\n",
      "\n",
      "1. Create the table using CREATE TABLE.\n",
      "2. Prepare an INSERT statement with all the rows.\n",
      "3. Execute the INSERT using sqlite-write_query.\n",
      "\n",
      "So the first tool call is sqlite-create_table with the CREATE TABLE query. Then, the second tool call is sqlite-write_query with the INSERT statement.\n",
      "\n",
      "But I need to make sure that the SQL syntax is correct. For the INSERT, the values should be in parentheses separated by commas. Each value pair (id, score) is a row. So the INSERT would be:\n",
      "\n",
      "INSERT INTO scores (id, score) VALUES (100, 85), (101, 90), ..., (109, 98);\n",
      "\n",
      "Let me verify the data. The rows are:\n",
      "\n",
      "100,85\n",
      "101,90\n",
      "102,88\n",
      "103,95\n",
      "104,80\n",
      "105,92\n",
      "106,87\n",
      "107,100\n",
      "108,83\n",
      "109,98\n",
      "\n",
      "Yes, that's 10 rows. So the INSERT statement will have all these values.\n",
      "\n",
      "Now, I need to structure the tool calls. First, create the table. Then, insert the data. The user's instruction is to \"create a table in the database, and fill it with the contents in the previous table.\" So the answer should involve both steps.\n",
      "\n",
      "I should return the first tool call for creating the table, then the second for inserting the data. Let me check the available functions again. The sqlite-create_table function takes a CREATE TABLE query, and sqlite-write_query can handle INSERT statements.\n",
      "\n",
      "So the first tool call is:\n",
      "\n",
      "{\n",
      "  \"name\": \"sqlite-create_table\",\n",
      "  \"arguments\": {\n",
      "    \"query\": \"CREATE TABLE scores (id INTEGER, score INTEGER);\"\n",
      "  }\n",
      "}\n",
      "\n",
      "Then, the second tool call:\n",
      "\n",
      "{\n",
      "  \"name\": \"sqlite-write_query\",\n",
      "  \"arguments\": {\n",
      "    \"query\": \"INSERT INTO scores (id, score) VALUES (100, 85), (101, 90), (102, 88), (103, 95), (104, 80), (105, 92), (106, 87), (107, 100), (108, 83), (109, 98);\"\n",
      "  }\n",
      "}\n",
      "\n",
      "That should do it. The user will have the table created and populated with the data from the CSV file.\n",
      "[TOOL_CALL] sqlite-create_table\n",
      "{\"query\": \"CREATE TABLE scores (id INTEGER, score INTEGER);\"}\n",
      "[TOOL_RESPONSE] sqlite-create_table\n",
      "Table created successfully\n",
      "[THINK]\n",
      "Okay, the user wants to create a table in the database and fill it with the data from the scores.csv file. I already created the table using the sqlite-create_table function. Now, I need to insert the data into the table.\n",
      "\n",
      "Looking at the data from the CSV, there are 10 rows with id and score columns. The best way to insert all these rows efficiently is to use a single INSERT statement with multiple VALUES clauses. That way, I can add all the data in one go instead of making multiple function calls.\n",
      "\n",
      "So, I'll construct an INSERT INTO scores (id, score) VALUES ... query with all the rows. Each row is separated by a comma. Let me list them out: (100,85), (101,90), and so on until (109,98). I need to make sure the syntax is correct, with each value in parentheses and separated by commas. Also, check that the number of rows matches the data provided. Once that's done, I'll use the sqlite-write_query function to execute the INSERT statement. This should populate the table with all the data from the CSV file.\n",
      "[TOOL_CALL] sqlite-write_query\n",
      "{\"query\": \"INSERT INTO scores (id, score) VALUES (100, 85), (101, 90), (102, 88), (103, 95), (104, 80), (105, 92), (106, 87), (107, 100), (108, 83), (109, 98);\"}\n",
      "[TOOL_RESPONSE] sqlite-write_query\n",
      "[{'affected_rows': 10}]\n",
      "[THINK]\n",
      "Okay, the user wanted to create a table and fill it with the data from the scores.csv file. I first used the sqlite-create_table function to create the 'scores' table with id and score columns. Then, I generated an INSERT statement with all the rows from the CSV and executed it using sqlite-write_query. The response showed that 10 rows were affected, which means the data was successfully inserted. Now, I should confirm that the table was created and the data is there. Maybe I can run a SELECT query to check, but the user didn't ask for that. I'll just let them know it's done and offer further help if needed.\n",
      "The table has been successfully created and populated with the 10 records from your CSV file. The database now contains a `scores` table with the following structure:\n",
      "\n",
      "```sql\n",
      "CREATE TABLE scores (\n",
      "    id INTEGER,\n",
      "    score INTEGER\n",
      ");\n",
      "```\n",
      "\n",
      "With these records inserted:\n",
      "- 100, 85\n",
      "- 101, 90\n",
      "- 102, 88\n",
      "- 103, 95\n",
      "- 104, 80\n",
      "- 105, 92\n",
      "- 106, 87\n",
      "- 107, 100\n",
      "- 108, 83\n",
      "- 109, 98\n",
      "\n",
      "Would you like to perform any analysis or operations on this data?"
     ]
    }
   ],
   "source": [
    "response_plain_text = ''\n",
    "\n",
    "for ret_messages in agent.run(messages):\n",
    "    response_plain_text = typewriter_print(ret_messages, response_plain_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "messages += ret_messages"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 4: Query the Highest Score\n",
    "\n",
    "We query the highest score in the database. You can also perform other queries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "messages += [{'role': 'user', 'content': 'Query the highest score from the database.'}]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[THINK]\n",
      "Okay, the user wants to query the highest score from the database. Let me think about how to approach this.\n",
      "\n",
      "First, I need to recall the structure of the database. The user created a table called 'scores' with columns 'id' and 'score'. The data has been populated with 10 entries. So, the table exists and has the necessary data.\n",
      "\n",
      "The goal is to find the highest score. In SQL, the standard way to get the maximum value of a column is using the MAX() function. So the query should select the maximum value from the 'score' column in the 'scores' table.\n",
      "\n",
      "The correct SQL query would be: SELECT MAX(score) FROM scores;\n",
      "\n",
      "I need to use the sqlite-read_query function because this is a SELECT statement. The user might expect just the highest score as a result. Let me make sure there are no typos in the query. Also, since the table name and column names are correctly specified, there shouldn't be any syntax errors.\n",
      "\n",
      "Once the query is executed, the result should return a single row with the maximum score, which in the given data is 100. I should present the result clearly to the user, confirming that it's the highest score from the database.\n",
      "[TOOL_CALL] sqlite-read_query\n",
      "{\"query\": \"SELECT MAX(score) AS highest_score FROM scores;\"}\n",
      "[TOOL_RESPONSE] sqlite-read_query\n",
      "[{'highest_score': 100}]\n",
      "[THINK]\n",
      "Okay, let me see. The user asked for the highest score from the database. I need to write an SQL query to get that. The table is called 'scores' and the column is 'score'. The function to use here is MAX(), right? So the query would be SELECT MAX(score) FROM scores;. Then, I should use the sqlite-read_query tool because it's a SELECT statement. Let me make sure the syntax is correct. Oh, maybe add an alias like AS highest_score to make the result clearer. That should return the maximum value in the score column. The previous steps showed that the data was inserted correctly, so the highest score should be 100. Let me check if there are any possible errors, but the query seems straightforward. Alright, time to execute it.\n",
      "The highest score in the database is **100**. \n",
      "\n",
      "This result was obtained by querying the maximum value from the `score` column in the `scores` table using the SQL function `MAX(score)`."
     ]
    }
   ],
   "source": [
    "response_plain_text = ''\n",
    "\n",
    "for ret_messages in agent.run(messages):\n",
    "    response_plain_text = typewriter_print(ret_messages, response_plain_text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Use the GUI\n",
    "\n",
    "We have explored the capabilities of the Qwen-Agent framework and Qwen models for data manipulations with databases.\n",
    "\n",
    "Even better, we can easily create a web UI for an agent!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2025-05-01 21:09:02,227 - mcp_manager.py - 110 - INFO - Initializing MCP tools from mcp servers: ['filesystem', 'sqlite']\n",
      "2025-05-01 21:09:02,233 - mcp_manager.py - 245 - INFO - Initializing a MCP stdio_client, if this takes forever, please check the config of this mcp server: filesystem\n",
      "2025-05-01 21:09:03,425 - mcp_manager.py - 245 - INFO - Initializing a MCP stdio_client, if this takes forever, please check the config of this mcp server: sqlite\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "* Running on local URL:  http://127.0.0.1:7860\n",
      "\n",
      "To create a public link, set `share=True` in `launch()`.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div><iframe src=\"http://127.0.0.1:7860/\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from qwen_agent.gui import WebUI\n",
    "\n",
    "agent = Assistant(\n",
    "    name=\"Qwen Assistant\",\n",
    "    description=\"I'm a digital assistant powered by Qwen-Agent, ask me anything!\",\n",
    "    llm=llm_cfg,\n",
    "    function_list=tools\n",
    ")\n",
    "\n",
    "WebUI(agent).run()\n",
    "# Have fun!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
